﻿* Encoding: UTF-8.
* Setting up working directory.
CD "C:\Users\ivan.petrusek\Desktop\Replication".

* Loading the indicator data matrix with respondent IDs (file: panel_id_wave_matrix.sav).
* Saving the loaded data matrix to the file CAB_panel_wide.sav. After merging the waves, this data file will contain the required waves of the CAB survey.
GET FILE="panel_id_wave_matrix.sav".
SORT CASES BY ID.
SAVE OUTFILE="CAB_panel_wide.sav".
EXECUTE.


******************************************************.
******************** 1. MERGE ********************.
******************************************************.


***** Merging respondents and required variables from the first two waves of the Czech Attitude Barometer with the indicator data matrix with respondent IDs.

** W1 (WAVE 1) - the first wave, fielded in June and July 2024 (the original full wave 1 data set is available online at the Czech Social Science Data Archive: https://doi.org/10.14473/CSDA/B73GVD).
** The following variables from the first wave are used in our analyses or data management:
* ID = Panel ID of the respondent – anonymized (NOTE: waves are merged using this variable).
* vlna_1 = Respondent’s participation in the first panel wave (0 = no, 1 = yes).
* vlna_2 = Respondent’s participation in the second panel wave (0 = no, 1 = yes).
* w1_vzdel_2 = Abolition of the 9th grade.
* w1_IDE_2 = Respondent’s age.
* w1_IDE_8 = Gender.
* w1_VZD = Education – 3 categories.
* w1_left_right = Ideology – self-placement on the left–right scale.
* w1_pol_interest = Interest in politics.
GET FILE="CAB_panel_wide.sav".
SORT CASES BY ID.
MATCH FILES
  /FILE=*
  /TABLE="PNS 2406 CoRe CAB W1_FINAL.sav"
  /BY ID
  /KEEP ID vlna_1 vlna_2 w1_vzdel_2 w1_IDE_2 w1_IDE_8 w1_VZD w1_left_right w1_pol_interest.
EXECUTE.
SAVE OUTFILE="CAB_panel_wide.sav".

** W2 (WAVE 2) - the second wave, fielded in September 2024 (the original wave 2 data set is available online at the Czech Social Science Data Archive: https://doi.org/10.14473/CSDA/CUSNBR).
** The following variables from the second wave are used in our analyses or data management:
* ID = Panel ID of the respondent – anonymized (NOTE: waves are merged using this variable).
* w2_zakl_let = Number of years spent in elementary school.
* w2_child_care = Number of children being raised.
* w2_yr_brth1 = Year of birth of the 1st child being raised.
* w2_yr_brth2 = Year of birth of the 2nd child being raised.
* w2_yr_brth3 = Year of birth of the 3rd child being raised.
* w2_yr_brth4 = Year of birth of the 4th child being raised.
* w2_yr_brth5 = Year of birth of the 5th child being raised.
* w2_yr_brth6 = Year of birth of the 6th child being raised.
* w2_yr_brth7 = Year of birth of the 7th child being raised.
GET FILE="CAB_panel_wide.sav".
SORT CASES BY ID.
MATCH FILES
  /FILE=*
  /TABLE="PNS 2409 CoRe CAB W2_FINAL.sav"
  /BY ID
  /KEEP ID vlna_1 vlna_2 w1_vzdel_2 w1_IDE_2 w1_IDE_8 w1_VZD w1_left_right w1_pol_interest
  w2_zakl_let w2_child_care w2_yr_brth1 w2_yr_brth2 w2_yr_brth3 w2_yr_brth4 w2_yr_brth5 w2_yr_brth6 w2_yr_brth7.
EXECUTE.
SAVE OUTFILE="CAB_panel_wide.sav".

*** Restricting the merged file to respondents who participated in both waves (the equivalent of an inner join).
select if vlna_1 = 1 AND vlna_2 = 1.
execute.
* The resulting merged dataset contains 1644 respondents who participated in both waves.
fre ID.
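
* Optional check, not part of the original workflow: a minimal sketch that confirms each ID occurs only once after the merge.
* The flag name id_first is hypothetical. It equals 1 for the first case of every ID and 0 for any duplicate.
* The flag is deleted again so that the saved file keeps its 18 variables.
SORT CASES BY ID.
MATCH FILES
  /FILE=*
  /BY ID
  /FIRST=id_first.
FREQUENCIES VARIABLES=id_first.
DELETE VARIABLES id_first.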

* Saving the final dataset with 1644 respondents and 18 variables.
SAVE OUTFILE='C:\Users\ivan.petrusek\Desktop\Replication\CAB_waves_1_and_2.sav'
  /COMPRESSED.

* Loading the merged data from the first two waves of the Czech Attitude Barometer panel data.
GET FILE='C:\Users\ivan.petrusek\Desktop\Replication\CAB_waves_1_and_2.sav'.


***********************************************************************.
******************** 2. DATA MANAGEMENT ********************.
***********************************************************************.


***** DEPENDENT VARIABLE: To what extent do you agree or disagree with the following measures in the field of education? 
* Abolishing the ninth grade of basic school and returning to an eight-year basic school system.

* Original variable: Measure: Abolition of the ninth grade of primary school and a return to an eight-year primary school.
fre w1_vzdel_2.
* Defining "Don't know" (56 respondents) and "Refusal" (4 respondents) as missing values.
mis val w1_vzdel_2 (98 99).
variable labels w1_vzdel_2 "Dependent variable: Abolition of the ninth grade of primary school".
value labels w1_vzdel_2
    5 "Strongly disagree"
    4 "Somewhat disagree"
    3 "Neither agree nor disagree"
    2 "Somewhat agree"
    1 "Strongly agree"
    98 "Don't know"
    99 "Refusal".
fre w1_vzdel_2.

***** KEY INDEPENDENT VARIABLE (i.e. the treatment): number of years attending a basic school (based on self-reported lengths).
fre w2_zakl_let.
cro w1_IDE_2 by w2_zakl_let.

* Excluding from the analysis respondents who gave any answer other than 8 years or 9 years.
* This dummy coding excludes 63 respondents.
recode w2_zakl_let (5 = 0) (6 = 1) (else = sysmis) into devitka_excl.
formats devitka_excl (F1.0).
variable labels devitka_excl "9 years of basic school (1) vs. 8 years of basic school (0)".
value labels devitka_excl 
    0"8 years of basic school (i.e. did not attend basic school for 9 years)"
    1"9 years of basic school".
cro w2_zakl_let by devitka_excl.
fre devitka_excl.


***** MODERATING AND CONTROL VARIABLES.
*** Number of years since completing basic school (years_since).
fre w2_zakl_let.
recode w2_zakl_let (2 = 5) (3 = 6) (4 = 7) (5 = 8) (6 = 9) (7 = 10) (else = sysmis) into roky.
cro w2_zakl_let by roky.
* Age of respondent.
fre w1_IDE_2.

* Computing the variable - the formula assumes that respondents started basic school at the age of 6. 
compute years_since = w1_IDE_2 - roky - 6.
variable labels years_since "Number of years since completing basic school".
formats years_since (F2.0).
fre years_since.
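
* Worked example of the formula above, with illustrative values: a 40-year-old respondent with 9 years of basic school gets 40 - 9 - 6 = 25.
* Optional range check, not part of the original workflow: the minimum should not be negative if the age and school-length codes are plausible.
DESCRIPTIVES VARIABLES=years_since
  /STATISTICS=MEAN MIN MAX.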


***  Year of birth (based on age variable in 2024).
fre w1_IDE_2.
compute year_birth = 2024 - w1_IDE_2.
formats year_birth (F4.0).
variable labels year_birth "Year of birth (based on age)".
fre year_birth.


***  Interest in politics: creating a dummy variable for high interest in politics.
fre w1_pol_interest.
recode w1_pol_interest (1 = 1) (else = 0) into high_interest.
formats high_interest (F1.0).
variable labels high_interest "High interest in politics (1) vs. moderate to no interest (0)".
fre high_interest.
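
* Optional check, not part of the original workflow: in RECODE, ELSE also covers missing values, so they are coded 0 here.
* The crosstab below, with user-missing codes included, shows which original codes ended up in each category.
* Whether w1_pol_interest has user-missing codes defined is an assumption. If it has none, the table matches a plain crosstab.
CROSSTABS
  /TABLES=w1_pol_interest BY high_interest
  /MISSING=INCLUDE.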


*** Political ideology: left-right placement.
mis val w1_left_right (98 99).
fre w1_left_right.


*** Number of children raised during their school attendance.
fre w2_child_care.

* Removing missing value definitions for this variable (so that codes for missing values can be used in IF conditions).
mis val w2_child_care ().
*** Birth years of raised children (only for respondents who raised/are raising at least one such child).
* Although birth years are cardinal variables, CVVM created a coding scheme for children's birth years (this is a very non-standard procedure).
* If the attitudinal data on the abolition of 9th grade were collected in 2024, a 15-year-old child (who is potentially in the ninth grade of primary school) must have been born in 2009.
* The year 2009 corresponds to code 16 across all birth year variables (with the lowest code 1 corresponding to the year 2019).
fre w2_yr_brth1 w2_yr_brth2 w2_yr_brth3 w2_yr_brth4 w2_yr_brth5 w2_yr_brth6 w2_yr_brth7.

compute children = 1.
do if (w2_child_care = 1).
    compute children = 0.
else if (MIN(w2_yr_brth1, w2_yr_brth2, w2_yr_brth3, w2_yr_brth4, w2_yr_brth5, w2_yr_brth6, w2_yr_brth7) <= 16).
    compute children = 2.
end if.
exe.

* Setting the children variable to system-missing where the original variable is missing.
* This applies if the respondent did not know (98) or refused to state (99) the number of raised children, or if the value is system-missing.
if (w2_child_care = 98 OR w2_child_care = 99 OR MISSING(w2_child_care)) children = $SYSMIS.
variable labels children "Children: has at least one raised child of primary school age".
value labels children 
    0"Has no (and never had any) child they cared for"
    1"Had children they cared for, but they are no longer of primary school age"
    2"Currently has at least one child of primary school age".
fre children.
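
* Optional check, not part of the original workflow: a crosstab of the source variable against the new one.
* Category 0 should contain only code 1 of w2_child_care, and codes 98 and 99 should not appear because those cases are system-missing on children.
CROSSTABS
  /TABLES=w2_child_care BY children.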

*** Creating two variables used in the models.
* Has no (and never had any) child they cared for.
recode children (0 = 1) (else = 0) into no_kids.
variable labels no_kids "Dummy variable: Has no (and never had any) child they cared for".
formats no_kids (F1.0).
cro children by no_kids.

* Currently has at least one child of primary school age.
recode children (2 = 1) (else = 0) into has_kids.
variable labels has_kids "Dummy variable: Currently has at least one child of primary school age".
formats has_kids (F1.0).
cro children by has_kids.


*** Sex of respondent: Female.
recode w1_IDE_8 (2 = 1) (1 = 0) into female.
variable labels female "Dummy variable: Sex equals female".
formats female (F1.0).
cro w1_IDE_8 by female.


*** University education.
recode w1_VZD (3 = 1) (else = 0) into university.
variable labels university "Dummy variable: Has a university education".
formats university (F1.0).
cro w1_VZD by university.


*** Interaction term for H2.
compute devitka_zajem = devitka_excl * high_interest.
variable labels devitka_zajem "Interaction: Experience * High interest in politics".
formats devitka_zajem (F1.0).
fre devitka_zajem.

*** Interaction term for H3.
compute interakce = devitka_excl * years_since.
variable labels interakce "Interaction: Experience * Years since completing basic school".
* The product can have two digits, so the display format matches years_since (F2.0).
formats interakce (F2.0).
fre interakce.
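
* Optional check, not part of the original workflow: a sketch of how the product term behaves.
* For devitka_excl = 0 the interaction is always 0, and for devitka_excl = 1 it equals years_since.
* For example, a respondent with 9 years of basic school and years_since = 25 gets 1 * 25 = 25.
MEANS TABLES=interakce years_since BY devitka_excl
  /CELLS=COUNT MIN MAX.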

* Saving the final dataset with 1644 respondents and 30 variables.
SAVE OUTFILE='C:\Users\ivan.petrusek\Desktop\Replication\CAB_waves_1_and_2.sav'
  /COMPRESSED.
